We derive exact solutions of simplified models for the temporal
evolution of the protein concentration within a cell population
arbitrarily far from the stationary state. We show that monitoring
the dynamics can assist in modeling and understanding the nature
of the noise and its role in gene expression and protein production.
We introduce a new measure, the cell turnover distribution, which
can be used to probe the phase of transcription of DNA into
messenger RNA.

Advances in experimental techniques that enable the direct observation
of gene expression in individual cells have demonstrated the importance of
stochasticity in gene expression, the translation into proteins of the
information encoded within DNA. Such variability can lead to deleterious effects in
cell function and cause diseases. On the positive side, stochasticity in gene
expression confers on cells the ability to be responsive to unexpected stresses and
may augment growth rates of bacterial cells compared to homogeneous
populations. Disentangling the various contributions to production fluctuations
is further complicated by the recent finding that different stochastic processes
yield the same response in the variance in protein abundance at stationarity.
A population of isogenic cells growing under the same environmental conditions
can exhibit protein abundances that vary greatly from cell to cell. The sources
of variability have been identified at multiple levels, with transcription and
translation playing a major role under certain circumstances.

The low concentration of reactants potentially has two important
consequences: the first is that fluctuations around the mean can be large; the
second is that the nature of the stochastic noise should be taken into account
in some detail, because one may not simply invoke the central limit theorem,
which leads to the universal and ubiquitous Gaussian noise. Thus, two genes
expressed at the same average abundance can produce populations with
different phenotypic noise strengths, defined as the ratio of the variance to
the mean value of the number of proteins. We show here that two distinct
models, one taking into account the detailed nature of the noise and the other
following from an application of the central limit theorem, yield exactly the
same stationary solution for the distribution of proteins in isogenic cells under
the same environmental conditions. The exact dynamical solutions of these
two simplified models demonstrate the value of monitoring the dynamics for
understanding the nature of the noise in a cell.

We make the simplifying assumption that the kinetics of gene expression can
be described approximately by four rate constants: $k_1$ and $k_2$ are the
transcription and translation rates, respectively, and $\gamma_1$ and $\gamma_2$
are the degradation rates for mRNA and proteins, respectively. It has been found
experimentally that proteins are produced in bursts with an exponential
distribution of the number of proteins produced in a given event. Following
Paulsson et al. and Friedman et al., we will assume that transcription pulses are
Poisson events and that the probability density that in a single event $I > 0$
proteins are produced, $w(I)$, is approximated by
$w(I) = (\gamma_1/k_2)\, e^{-\gamma_1 I/k_2}$, where $k_2/\gamma_1$ is the
translation efficiency, i.e. the mean number of proteins produced in a given
burst. Here we consider a simple model for the production of proteins without
memory and aging of molecules. Using the specific burst distribution given
above allows one to obtain the shape of the protein distribution even far from
stationarity. Under these assumptions, the stochastic equation that governs
the single-variable dynamics of gene expression can be written as

$$\dot{x}(t) = \delta - \gamma_2 x(t) + \Lambda(t). \tag{1}$$

Figure 1: The stationary distribution of proteins in a prokaryotic cell
population, taken from the literature, fitted to Eq. (3) with $\delta = 0$ or
$\delta/\gamma_2 = 60.3$ (dashed). The best-fit parameters are
$\gamma_1/k_2 = 0.038$, $k_1/\gamma_2 = 12.88$ ($\chi^2 \approx 6100$) and
$\gamma_1/k_2 = 0.030$, $k_1/\gamma_2 = 8.33$ ($\chi^2 \approx 8700$), respectively.
From the experimental data it is hard to distinguish between the steady-state
distributions predicted by Eq. (3) with $\delta = 0$ and $\delta > 0$.

This pseudo-equation describes the real-time stochastic evolution of gene
expression through a deterministic part and a stochastic term $\Lambda(t)$, which
will be defined later on. Here $x$ is a continuous variable that represents the
number of proteins within a cell. The constant $\delta$ is added for generality
and can be incorporated into the average noise.

In order to understand the nature of the noise for the gene expression case,
let us consider the random variable $I_k$, which measures the number of
proteins produced in the $k$-th transcription event, where $k = 1, 2, \ldots, n$.
A key quantity of interest is $\sum_{k=1}^{n} I_k = \Lambda(t)\Delta t$, where $n(t)$,
the number of events in the time interval $(t, t + \Delta t)$, is a random variable
independent of both $x$ and the $I_k$'s. As in the experiment, let us postulate
that: i) the $I_k$'s are independent and identically distributed with an
exponential distribution; and ii) the probability of $n$ events occurring during
the time interval $\Delta t$ is given by the Poisson distribution
$g_n(\Delta t) = (k_1 \Delta t)^n \exp(-k_1 \Delta t)/n!$. The distribution of
$\Lambda(t)$ that we use in Eq. (1) can be explicitly calculated and leads to the
following expression for the cumulants:
$\langle\langle \Lambda(t_1) \cdots \Lambda(t_n) \rangle\rangle = n!\, k_1 (k_2/\gamma_1)^n \prod_{i=2}^{n} \delta(t_i - t_1)$
for $n \geq 2$, and $\langle \Lambda(t) \rangle = k_1 k_2/\gamma_1$, independent of
time. Because the cumulants are delta functions, the noise is still white (events
are uncorrelated if they occur at different times); however, the noise is no
longer Gaussian because cumulants of order greater than two are non-zero.
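The two postulates above can be checked numerically. The following is a minimal sketch, not the authors' procedure: it samples the increment $\Lambda(t)\Delta t$ as a Poisson number of exponentially distributed bursts and compares the sample mean and variance with the first two cumulants integrated over the window, $k_1 (k_2/\gamma_1)\Delta t$ and $2 k_1 (k_2/\gamma_1)^2 \Delta t$. The function names, the parameter values and the seed are illustrative assumptions.

```python
import random

def sample_noise_increment(k1, burst_mean, dt, rng):
    """Return one sample of Lambda(t)*dt: total protein production in a window dt."""
    # Number of bursts: count exponential waiting times (rate k1) that fit in dt.
    n_events = 0
    elapsed = rng.expovariate(k1)
    while elapsed < dt:
        n_events += 1
        elapsed += rng.expovariate(k1)
    # Each burst size I_k is exponential with mean burst_mean = k2/gamma1.
    return sum(rng.expovariate(1.0 / burst_mean) for _ in range(n_events))

def sample_cumulants(samples):
    """Estimate the first two cumulants (mean and variance) of a sample."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, var

# Illustrative values (assumptions, not fitted parameters).
k1, burst_mean, dt = 2.0, 5.0, 1.0
rng = random.Random(1)
samples = [sample_noise_increment(k1, burst_mean, dt, rng) for _ in range(20000)]
mean, var = sample_cumulants(samples)

expected_mean = k1 * burst_mean * dt             # first cumulant
expected_var = 2.0 * k1 * burst_mean ** 2 * dt   # n! k1 (k2/gamma1)^n dt with n = 2
print(mean, expected_mean, var, expected_var)
```

For a Gaussian noise with the same mean and variance all higher cumulants would vanish, whereas here the third cumulant of the increment is $6 k_1 (k_2/\gamma_1)^3 \Delta t$, which is why the burst noise is skewed.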

The master equation that describes this burst-like process is

$$\frac{\partial p(x,t)}{\partial t} = \frac{\partial}{\partial x}\left[(\gamma_2 x - \delta)\, p(x,t)\right] + k_1 \int_0^x w(x - y)\, p(y,t)\, dy - k_1\, p(x,t), \tag{2}$$

where $p(x,t) \equiv p(x,t|x_0,0)$ is the conditional probability density that the
protein concentration has the value $x$ at time $t$ given that it had the value
$x_0$ at time 0; and $w(x) = (\gamma_1/k_2)\, e^{-\gamma_1 x/k_2}$. The stationary
solution of this model (with $\delta = 0$) was
first obtained by Paulsson et al. and subsequently re-derived by Friedman
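et al. For arbitrary $\delta > 0$, we find the stationary solution is

$$p_s(x) = \frac{(\gamma_1/k_2)^{k_1/\gamma_2}}{\Gamma(k_1/\gamma_2)} \left(x - \frac{\delta}{\gamma_2}\right)^{k_1/\gamma_2 - 1} \exp\left[-\frac{\gamma_1}{k_2}\left(x - \frac{\delta}{\gamma_2}\right)\right] \Theta\left(x - \frac{\delta}{\gamma_2}\right), \tag{3}$$

where $\Theta(x)$ is the step function, equal to 1 when $x > 0$ and zero otherwise.
This distinctive feature, a sharp cut-off at $x = \delta/\gamma_2$, is a signature
of the nature of the noise even in the stationary solution, but it is present only
when $\delta \neq 0$. However, as shown in the fit to the stationary solution in
Fig. 1, the singularity, if it exists, is easily masked by other noise effects
leading to a rounding effect.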
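As a numerical illustration of the stationary state for $\delta > 0$, the sketch below evaluates a shifted Gamma density and checks its normalization and mean by quadrature. The shifted Gamma form is an assumption of this sketch: it follows from the substitution $y = x - \delta/\gamma_2$, which maps Eq. (1) onto the $\delta = 0$ problem. The parameter values reproduce only the ratios quoted for the dashed fit in Fig. 1, with $\gamma_2 = k_2 = 1$ chosen arbitrarily; the helper names are hypothetical.

```python
import math

def stationary_pdf(x, k1, k2, gamma1, gamma2, delta=0.0):
    """Shifted Gamma density: shape k1/gamma2, rate gamma1/k2, origin delta/gamma2."""
    shape = k1 / gamma2
    rate = gamma1 / k2
    y = x - delta / gamma2
    if y <= 0.0:
        return 0.0  # step function: no probability at or below delta/gamma2
    log_p = (shape * math.log(rate) + (shape - 1.0) * math.log(y)
             - rate * y - math.lgamma(shape))
    return math.exp(log_p)

def trapezoid(func, lo, hi, n):
    """Composite trapezoidal rule with n equal sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h

# Ratios taken from the Fig. 1 caption; absolute units are an arbitrary choice.
params = dict(k1=8.33, k2=1.0, gamma1=0.030, gamma2=1.0, delta=60.3)

norm = trapezoid(lambda x: stationary_pdf(x, **params), 0.0, 3000.0, 60000)
mean = trapezoid(lambda x: x * stationary_pdf(x, **params), 0.0, 3000.0, 60000)
expected_mean = (params["delta"] / params["gamma2"]
                 + params["k1"] * params["k2"] / (params["gamma1"] * params["gamma2"]))
print(norm, mean, expected_mean)
```

The density is identically zero below $\delta/\gamma_2$, which is the cut-off discussed in the text; the mean is shifted by the same amount relative to the $\delta = 0$ case.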

Although experiments on gene expression are consistent with a burst-like
protein production, steady-state distributions of protein abundances are
equally compatible with alternative explanations. In fact, because mRNA is
unstable compared to the protein lifetime ($\gamma_1 \gg \gamma_2$), one can assume
that transcripts give rise to a constant flux of proteins $f$ and that
subsequently any protein degrades at a constant rate $\gamma_2$. Because of the
great amount of available molecules, one can apply the central limit theorem and
suppose that the amplitude of fluctuations is simply proportional to $\sqrt{x}$.
Within this framework there is no burst-like production; nevertheless, the
stationary solutions that one obtains for a burst-like process, including that of
the extended autoregulation model, are also obtained in models with appropriately
chosen random multiplicative Gaussian noise. Within this scenario the stochastic evolution
of the protein concentration $x(t)$ is governed by the equation

$$\dot{x}(t) = f - \gamma_2 x(t) + \sqrt{D x(t)}\, \eta(t), \tag{4}$$

where $\eta(t)$ is a Gaussian white noise with autocorrelation
$\langle \eta(t)\eta(t') \rangle = 2\delta(t - t')$. Note that the same equation
could be obtained on setting $\langle \Lambda(t) \rangle = f$ and
$\langle\langle \Lambda(t)\Lambda(t') \rangle\rangle = \langle \Lambda(t)\Lambda(t') \rangle - \langle \Lambda(t) \rangle\langle \Lambda(t') \rangle = 2Dx(t)\,\delta(t - t')$
in Eq. (1), with all higher-order cumulants being identically zero. We point out
that in ecology Eq. (4) is useful for studying the evolution of tropical forests,
where the detailed nature of the stochastic noise is not important because of the
relatively large numbers of trees of a given species. In the field of finance,
Eq. (4) has been used to study the evolution of interest rates (the
Cox-Ingersoll-Ross model), where analogous considerations on fluctuations can be
made. On defining $f = k_1 k_2/\gamma_1$ and $D = \gamma_2 k_2/\gamma_1$, Eq. (4)
yields the same stationary state as Eq. (2) with $\delta = 0$, i.e. Eq. (3). The
mean number of proteins at stationarity is $k_1 k_2/(\gamma_1 \gamma_2)$ and the
phenotypic noise strength at stationarity is $k_2/\gamma_1$, relations that are
consistent with previous findings. In order to take into account the effects of
feedback in a system undergoing auto-regulation, one can introduce the physically
transparent modification $f \to D c(x)$, where $c$ is a response function which
can be modeled as having two distinct limiting values at zero and at infinity,
with the latter being smaller than the former. Even in this situation, we obtain
the same stationary distribution with bistability as Friedman et al. Despite this
much more realistic analysis, the final stationary protein distribution is
experimentally indistinguishable from Eq. (3) with $\delta = 0$. Thus, a
theoretical modeling of the stationary state of protein production provides little
insight into the microscopic nature of the noise that leads to stationarity.
According to our approach, stochasticity in gene expression ensues from the large
number of available components, which entangle many different mechanisms within a
cell. Interestingly, this calls for effective mechanisms which can dampen the
deleterious effects of protein noise. Such efficient noise-reducing mechanisms
could be a combination of gestation and senescence, because of their ability to
prevent fluctuations rather than correcting them.

Figure 2: Protein distribution dynamics for different types of noise and with
the same initial conditions, i.e. $x_0 = 0$ proteins at $t = 0$. The dashed curve
is for the multiplicative Gaussian noise, i.e. Eq. (4) with $f = k_1 k_2/\gamma_1$
and $D = \gamma_2 k_2/\gamma_1$, whereas the other curve is for the non-Gaussian
noise, i.e. for Eq. (5). In both cases the parameters are $\delta = 0$,
$\gamma_2/D = \gamma_1/k_2 = 0.038$, $f/D = k_1/\gamma_2 = 12.88$ and we have set
$\gamma_2^{-1} = 40$ min, $\gamma_1^{-1} = 2$ min.
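The stationary mean and noise strength quoted for Eq. (4) can be illustrated with a short simulation. This is a rough sketch under stated assumptions: Eq. (4) is read in the Itô sense, so that with $\langle \eta(t)\eta(t') \rangle = 2\delta(t - t')$ it becomes $dx = (f - \gamma_2 x)\,dt + \sqrt{2 D x}\, dW$; the integrator is a plain Euler-Maruyama scheme with clipping at zero, which is a crude but common choice; time is measured in units of $1/\gamma_2$; and the function name, step size and seed are illustrative.

```python
import math
import random

def simulate_gaussian_model(f, gamma2, D, x0, dt, n_steps, rng):
    """Euler-Maruyama for dx = (f - gamma2*x) dt + sqrt(2*D*x) dW (Ito reading of Eq. (4))."""
    x = x0
    path = []
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment over one step
        x += (f - gamma2 * x) * dt + math.sqrt(2.0 * D * max(x, 0.0)) * dw
        x = max(x, 0.0)                      # clip discretization overshoots below zero
        path.append(x)
    return path

# Units where gamma2 = 1; the two ratios echo the values quoted in Fig. 2.
gamma2 = 1.0
burst_mean = 1.0 / 0.038        # k2/gamma1
k1 = 12.88 * gamma2
f = k1 * burst_mean             # f = k1*k2/gamma1
D = gamma2 * burst_mean         # D = gamma2*k2/gamma1

rng = random.Random(7)
path = simulate_gaussian_model(f, gamma2, D, x0=0.0, dt=0.01, n_steps=200_000, rng=rng)
tail = path[1000:]              # drop the initial transient (10 protein lifetimes)
mean = sum(tail) / len(tail)
var = sum((x - mean) ** 2 for x in tail) / len(tail)
fano = var / mean
print(mean, f / gamma2, fano, D / gamma2)
```

The time-averaged mean should approach $f/\gamma_2 = k_1 k_2/(\gamma_1\gamma_2)$ and the ratio of variance to mean should approach $D/\gamma_2 = k_2/\gamma_1$, the same values as for the burst model, up to sampling and discretization error.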

These results raise the question of whether the agreement between the
stationary solutions of the theoretical models and experiments is in fact a direct
probe of the nature of the microscopic noise, and whether the asymmetric
stationary solutions derive from a careful consideration of the bursty nature of
the noise. In order to circumvent the indistinguishability of steady states, one
can look into empirical protein abundances far from stationarity, for which we
provide analytical formulas. Thus we turn now to a study of the dynamics of
Eq. (2), which is a powerful probe of the noise effects. We have derived the
solution at arbitrary time:



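$$\begin{aligned} p(x,t) ={}& e^{-k_1 t}\, \delta\big(x - \zeta(t)\big) + \frac{k_1 \gamma_1}{\gamma_2 k_2} \left(e^{\gamma_2 t} - 1\right) e^{-k_1 t}\, \Theta\big(x - \zeta(t)\big) \\ & \times \exp\left[-\frac{\gamma_1}{k_2}\big(x - \zeta(t)\big)\right]\, {}_1F_1\!\left(1 - \frac{k_1}{\gamma_2},\, 2;\, -\frac{\gamma_1}{k_2}\left(e^{\gamma_2 t} - 1\right)\big(x - \zeta(t)\big)\right), \end{aligned} \tag{5}$$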



where ${}_1F_1(a, b; x)$ is the confluent hypergeometric function and
$\zeta(t) = x_0 e^{-\gamma_2 t} + (\delta/\gamma_2)\left(1 - e^{-\gamma_2 t}\right)$
is the solution of the deterministic part of the equation, i.e. without the
noise. On using Eq. (5), one can calculate the phenotypic noise at
any time, arbitrarily far from stationarity, and furthermore one can study its
behavior starting from an arbitrary initial amount of proteins. Note that $\delta$
enters only through $\zeta(t)$ and the temporal evolution scales are determined by
the transcription rate and the degradation rate of proteins. Interestingly, one
obtains a distribution of proteins with a cut-off along the interval
$[0, \zeta(t))$ at any time whenever $\delta > 0$. By exploiting the exact dynamical
solution one can study the evolution of the phenotypic noise strength in finer
detail and probe the experimental consequences of the burst process hypothesis
(Fig. 2).
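The time dependence of the phenotypic noise strength can be illustrated without the full distribution, using only the equations for the first two moments. The sketch below is an illustration under assumptions, not the authors' calculation: it takes $\delta = 0$ and $x_0 = 0$, reads Eq. (4) in the Itô sense, and integrates $d\langle x \rangle/dt = f - \gamma_2 \langle x \rangle$ together with $d\,\mathrm{Var}/dt = -2\gamma_2 \mathrm{Var} + S$, where $S = 2 k_1 (k_2/\gamma_1)^2$ for the burst noise (its second cumulant) and $S = 2 D \langle x \rangle$ for the Gaussian noise. Under these assumptions the ratio of variance to mean is $(k_2/\gamma_1)(1 + e^{-\gamma_2 t})$ for the burst model and $(k_2/\gamma_1)(1 - e^{-\gamma_2 t})$ for the Gaussian model. Rates are in units of 1/min and follow the ratios and lifetimes quoted in Fig. 2; the function name is hypothetical.

```python
import math

def moments(model, k1, k2, gamma1, gamma2, t_end, dt=0.01):
    """Forward-Euler integration of mean and variance from x0 = 0 with delta = 0."""
    b = k2 / gamma1          # mean burst size
    f = k1 * b               # mean production rate
    D = gamma2 * b           # noise amplitude of Eq. (4)
    mean, var = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        if model == "burst":
            source = 2.0 * k1 * b * b    # second cumulant of the burst noise
        else:
            source = 2.0 * D * mean      # state-dependent Gaussian noise
        # Tuple assignment: both updates use the values from the previous step.
        mean, var = (mean + (f - gamma2 * mean) * dt,
                     var + (source - 2.0 * gamma2 * var) * dt)
    return mean, var

gamma2 = 1.0 / 40.0          # protein lifetime 40 min
gamma1 = 1.0 / 2.0           # mRNA lifetime 2 min
k2 = gamma1 / 0.038          # from gamma1/k2 = 0.038
k1 = 12.88 * gamma2          # from k1/gamma2 = 12.88
b = k2 / gamma1

fano = {}
for model in ("burst", "gaussian"):
    for t_end in (4.0, 400.0):
        mean, var = moments(model, k1, k2, gamma1, gamma2, t_end)
        fano[(model, t_end)] = var / mean
print(fano)
```

At early times the two models differ strongly (the burst model starts at twice the stationary noise strength, the Gaussian model at zero), while at late times both converge to $k_2/\gamma_1$, which is the indistinguishability at stationarity discussed above.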

A measurable quantity that directly probes the protein distribution and its
temporal evolution is the cell-turnover distribution (CTD), denoted by
$\mathcal{P}(\rho, t)$ and defined as the probability density that at time $t$ the
ratio $x(t)/x(0)$ is equal to $\rho$, where $x(t)$ and $x(0)$ are the numbers of
proteins within an isogenic cell population at times $t > 0$ and $t = 0$,
respectively. This quantity can be defined both close to and far from
stationarity: if the initial distribution of the proteins within the cell
population is the steady state given by Eq. (3) with $\delta = 0$, then at a
subsequent time $t > 0$ the CTD is

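$$\mathcal{P}(\rho,t) = e^{-k_1 t}\, \delta\!\left(\rho - e^{-\gamma_2 t}\right) + \left(\frac{k_1}{\gamma_2}\right)^{2} \frac{e^{-k_1 t}\left(e^{\gamma_2 t} - 1\right) \Theta\!\left(\rho - e^{-\gamma_2 t}\right)}{\left[1 + e^{\gamma_2 t}\left(\rho - e^{-\gamma_2 t}\right)\right]^{k_1/\gamma_2 + 1}}\; {}_2F_1\!\left(1 + \frac{k_1}{\gamma_2},\, 1 + \frac{k_1}{\gamma_2};\, 2;\, \frac{\left(1 - e^{-\gamma_2 t}\right)\left(\rho - e^{-\gamma_2 t}\right)}{\rho}\right), \tag{6}$$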

where ${}_2F_1(a, b; c; x)$ is the standard hypergeometric function. Thus,
according to Eq. (6), under the burst process hypothesis we predict that i) the
CTD vanishes between 0 and $e^{-\gamma_2 t}$ even though the system is at
stationarity, an effect which ought to be detectable for time scales less than or
of the order of $1/\gamma_2$; ii) the CTD depends only on $k_1$ and $\gamma_2$ but
not on translational efficiency and other rates; iii) at very large time
separation there is only one free parameter, the ratio $k_1/\gamma_2$, and the
CTDs predicted by the Gaussian and non-Gaussian noises become the same (Fig. 3).
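Prediction i) can be illustrated with a direct simulation of the burst model. The following is a minimal sketch under assumptions: $\delta = 0$; $x(0)$ is drawn from a Gamma distribution with shape $k_1/\gamma_2$ and scale $k_2/\gamma_1$ (the $\delta = 0$ stationary state); bursts arrive as a Poisson process of rate $k_1$ with exponentially distributed sizes; and every contribution decays at rate $\gamma_2$. Rates are in units of 1/min and follow the values quoted in the figure captions; the function name, sample size and seed are illustrative.

```python
import math
import random

def sample_turnover_ratio(k1, k2, gamma1, gamma2, t, rng):
    """Draw one rho = x(t)/x(0) for the burst model with delta = 0, x(0) stationary."""
    b = k2 / gamma1
    x0 = rng.gammavariate(k1 / gamma2, b)   # shape k1/gamma2, scale k2/gamma1
    x = x0 * math.exp(-gamma2 * t)          # deterministic decay of the initial proteins
    s = rng.expovariate(k1)                 # arrival time of the first burst
    while s < t:
        burst = rng.expovariate(1.0 / b)    # burst size with mean k2/gamma1
        x += burst * math.exp(-gamma2 * (t - s))  # burst decays from its arrival time
        s += rng.expovariate(k1)
    return x / x0

gamma2 = 1.0 / 40.0
gamma1 = 1.0 / 2.0
k2 = gamma1 / 0.038
k1 = 12.88 * gamma2
t = 4.0                                     # minutes, well below 1/gamma2

rng = random.Random(3)
ratios = [sample_turnover_ratio(k1, k2, gamma1, gamma2, t, rng) for _ in range(20000)]
cutoff = math.exp(-gamma2 * t)
# Cells with no burst in (0, t) sit exactly at the cut-off (up to rounding).
at_cutoff = sum(1 for r in ratios if r <= cutoff * (1.0 + 1e-9)) / len(ratios)
print(min(ratios), cutoff, at_cutoff, math.exp(-k1 * t))
```

No sampled ratio falls below $e^{-\gamma_2 t}$, and the fraction of cells sitting at the cut-off is close to $e^{-k_1 t}$, the probability that no burst occurred. Neither quantity depends on $k_2$ or $\gamma_1$ separately, in line with prediction ii).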

The analogous time-dependent solutions for the Gaussian white noise can be
compared with Eqs. (5) and (6). The closer the system is to its steady state,
the more difficult it is to distinguish among the effects of gestation, senescence
and burst-like production. Thus an experimental protocol capable of analyzing
the cell population and its time evolution with different initial conditions would
help to disentangle the nature of the stochastic noise. At early times, the
evolution of the distribution is strongly affected by the specific mechanisms
involved in the dynamics. At this stage, different distributions of waiting times
between events or burst sizes produce non-stationary distributions that are
very different, and the distinctive effects of noise, deterministic driving forces
or coupling of degrees of freedom can be elucidated. Different conditions at
initial times propagate into the early temporal evolution in strongly different
ways according to the different effects of the mechanisms involved, but inexorably
lead to the same distribution for large time separation.

Figure 3: Cell Turnover Distribution (CTD). The dashed curve is for the
Gaussian noise case, i.e. the CTD is calculated assuming that the governing
equation is Eq. (4) with $f = k_1 k_2/\gamma_1$ and $D = \gamma_2 k_2/\gamma_1$,
whereas the other curve is for the non-Gaussian noise case, i.e. for Eq. (6). In
this case the arrow indicates the cut-off point. Note, however, that extrinsic
noise could tend to smooth out the discontinuity. In both cases
$k_1/\gamma_2 = 12.88$ and we have set $\gamma_2^{-1} = 40$ min.

We are grateful to Sunney Xie and his group for useful correspondence.